Why WWW::Mechanize::PhantomJS?
What is WWW::Mechanize::PhantomJS?
Development of WWW::Mechanize::PhantomJS
Applications
Max Maischein
DZ BANK Frankfurt
Deutsche Zentralgenossenschaftsbank
Information management
If I can do it manually
... the computer can repeat it
... correctly every time
Perl (obviously)
Host-Automation (3270, Win32::OLE)
WWW::Mechanize
WWW::Mechanize::Shell (GPW 2002)
WWW::Mechanize::Firefox (2010)
... and now WWW::Mechanize::PhantomJS
Web applications are still on the rise
Applications hold state in the client
Applications rely heavily on Javascript
Javascript is not Perl's strongest side
JavaScript::SpiderMonkey by Mike Schilli, Thomas Busch on CPAN
Only Javascript, no DOM
Javascript::Engine by Father Chrysostomos/SPROUT on CPAN
Pure Perl, slooow
Recognized platform
Compatible platform
Interactive Platform
WWW::Mechanize::Firefox
WWW::Mechanize::Firefox wants a UI window
WWW::Mechanize::Firefox wants to use my browser
PhantomJS is WebKit, but without a UI
PhantomJS
ghostdriver
Selenium::Remote::Driver
WWW::Mechanize::PhantomJS
My program
An extended interface
of WWW::Mechanize
using PhantomJS as backend
    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get('http://act.yapc.eu/ye2014');
    $mech->content_as_png();
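The snippet above discards the PNG data that content_as_png() returns. A minimal sketch of keeping it on disk follows. The helper save_binary and the file name screen.png are hypothetical and not part of the module, and the stand-in bytes only exist so the sketch runs without PhantomJS.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of WWW::Mechanize::PhantomJS):
# write a string of bytes to a file without encoding or newline translation.
sub save_binary {
    my ($filename, $bytes) = @_;
    open my $fh, '>:raw', $filename
        or die "Cannot write '$filename': $!";
    print {$fh} $bytes;
    close $fh
        or die "Cannot close '$filename': $!";
    return $filename;
}

# Stand-in data so the sketch runs without PhantomJS:
# the 8-byte signature every PNG file starts with.
my $png  = "\x89PNG\r\n\x1a\n";
my $file = save_binary(tempdir(CLEANUP => 1) . '/screen.png', $png);
printf "%s: %d bytes\n", $file, -s $file;
```

With a live browser object, the call would presumably look like save_binary('screen.png', $mech->content_as_png()).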
Normal WWW::Mechanize API
Javascript
CSS selectors (via HTML::Selector::XPath)
XPath selectors
Javascript error messages!
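The slide credits HTML::Selector::XPath for CSS selector support. A small sketch of what that translation looks like on its own, using the selector from the link-dumping example in this deck. How WWW::Mechanize::PhantomJS calls the translator internally is an assumption here; only the standalone function is shown.

```perl
use strict;
use warnings;
use HTML::Selector::XPath qw(selector_to_xpath);

# Translate a CSS selector into the equivalent XPath expression.
my $css   = 'a.download';
my $xpath = selector_to_xpath($css);

print "CSS:   $css\n";
print "XPath: $xpath\n";
```

Seeing the generated XPath can help when a CSS selector matches nothing and it is unclear why.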
Automate web sites
Integrated JS unit tests
Validate user input using Javascript server-side
Crazy things
Control PhantomJS
01-open-local.pl
    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get_local('file.html');
Web site usability test
02-dump-links.pl
    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get_local('link.html');

    sleep 5;

    print $_->get_attribute('href'),
        "\n\t-> ",
        $_->get_attribute('innerHTML'), "\n"
        for $mech->selector('a.download');
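The fixed "sleep 5" above gives the page's Javascript time to create the links. One possible alternative is to poll until something shows up. The helper wait_until below is hypothetical and not part of the module; the counter only stands in for a page that needs a few polls, so the sketch runs without PhantomJS.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper (not part of WWW::Mechanize::PhantomJS):
# call $check until it returns a non-empty list or the timeout expires.
sub wait_until {
    my ($check, %opt) = @_;
    my $timeout  = $opt{timeout}  // 5;     # seconds
    my $interval = $opt{interval} // 0.1;   # seconds between polls
    my $deadline = time() + $timeout;
    while (1) {
        my @found = $check->();
        return @found if @found;
        return if time() >= $deadline;      # give up, return empty list
        sleep $interval;
    }
}

# Stand-in for a page whose links only exist from the third poll on.
my $polls = 0;
my @links = wait_until(
    sub { ++$polls >= 3 ? ('a.download') : () },
    timeout => 2,
);
print "found after $polls polls: @links\n";
```

With a live browser object, the check could be sub { $mech->selector('a.download') }, assuming selector() returns an empty list while nothing matches yet.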
Execute Javascript
03-javascript.pl
    // Javascript
    ["Just","another","Perl","Hacker"].join(" ");
Execute Javascript
03-javascript.pl
    # Perl
    print $mech->eval_in_page(<<'JS');
    ["Just","another","Perl","Hacker"].join(" ");
    JS
Chat application
Javascript+Perl
Server-Sent Events
Tests
05-screenshot-online.pl
    my $mech = WWW::Mechanize::PhantomJS->new();
    my $url = 'http://mychat.dyn.datenzoo.de:5000';
    print "Loading $url\n";
    $mech->get($url);

    show_screen;
06-send-chat.pl
    $mech->get($url);

    sleep 5;
    # Set username
    $mech->eval_in_page(<<'JS', $name);
    (function(name) {
        set_username(name);
    })(arguments[0]);
    JS
    sleep 1;
06-send-chat.pl
    # Send chat
    $mech->eval_in_page(<<'JS', $msg);
    (function(msg) {
        $("#message").val( msg );
        post_chat( document.createEvent('UIEvent') );
    })(arguments[0]);
    JS
06-send-chat.pl
    http://www.youtube.com/v/pir_PJmOz8Q
    https://twitter.com/cpan_pevans/status/503239001101586432
    http://i.qkme.me/3pvsb6.jpg
07-screenshot-pdf.pl
    my $mech = WWW::Mechanize::PhantomJS->new();
    my $url = 'http://localhost:5000';
    print "Loading $url\n";
    $mech->get($url);

    $mech->render_content(
        format   => 'pdf',
        filename => 'screen.pdf'
    );
PhantomJS
ghostdriver (included with module)
Patches for Ghostdriver to circumvent Selenium restrictions (included)
WWW::Mechanize
Selenium::Remote::Driver
API implementation (->post(), ...)
API extensions
Documentation
->post()
Custom HTTP headers (->agent(), ...)
Easy functions implemented first
Selenium is "User simulation" only
Selenium has no ->post() function
->post() function half-implemented
Did not yet need it
Define an API for
browser windows (open, close, popup)
Frames (bad Selenium support)
Alerts (window.alert())
Downloads
Event API? Callback API?
List of things that happened since the last call?
Documentation for the module API
WWW::Mechanize::PhantomJS
Documentation to answer questions
WWW::Mechanize::PhantomJS::Examples
WWW::Mechanize::PhantomJS::Troubleshooting
Adapt ::Firefox documentation
WWW::Mechanize::PhantomJS::Examples
WWW::Mechanize::PhantomJS::Troubleshooting
WWW::Mechanize::PhantomJS::Installation
(A)synchronous event model
Asynchronous communication (AnyEvent)
Less Selenium
Less mandatory configuration (ports, ...)
                       PhantomJS   Firefox

Display                No          Yes
Persistent cookies     No          Yes
Custom certificates    Easy        Hard
Dialogs                Possible    Hard
alert()                Possible    Hard
The Good
Existing test suite of WWW::Mechanize::Firefox
Existing API of WWW::Mechanize
Experience with ::Firefox
32bit App, 64bit Perl -> TCP!
The Good, the Bad
Selenium is ONLY for browser "interaction"
Selenium doesn't like frames
Hacks for ghostdriver-API
No communication with ghostdriver developers
The Good, the Bad, the Ugly
API coverage through tests
Subtle differences between ::Firefox and ::PhantomJS
100% pass until
    s/::Firefox/::PhantomJS/g
All sample code will be on CPAN as
WWW::Mechanize::PhantomJS::Examples
Questions?
Slides available at
WWW::Mechanize::PhantomJS on CPAN
... tbd ...